Single-photon avalanche diodes (SPADs) are growing in popularity for depth sensing tasks. However, SPADs still struggle in the presence of strong ambient light due to the effects of pile-up. Conventional techniques leverage fixed or asynchronous gating to minimize pile-up, but these gating schemes are all non-adaptive, as they are unable to incorporate factors such as scene priors and previous photon detections into their gating strategy. We propose an adaptive gating scheme built upon Thompson sampling. Adaptive gating periodically updates the gate position based on prior photon observations in order to minimize depth errors. Our experiments show that our gating strategy significantly reduces depth reconstruction error and acquisition time, even when operating outdoors under strong sunlight.
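Although the paper's implementation is not shown here, the gating loop it describes can be illustrated with a short sketch. Below is a minimal, hypothetical simulation of Thompson-sampling-based gating: a Dirichlet posterior over time-of-flight bins is sampled each laser cycle, the gate is placed where the sampled belief peaks, and every detected photon updates the posterior. The toy SPAD model, bin count, gate length, and rates are all illustrative assumptions, not values from the paper.

```python
import numpy as np

# A minimal sketch of Thompson-sampling-based adaptive gating.
# We maintain a Dirichlet posterior over which histogram bin contains
# the true depth, sample a belief, gate where it peaks, and update the
# posterior on each detected photon. All values are illustrative.

rng = np.random.default_rng(0)
n_bins = 100                      # discretized time-of-flight bins
alpha = np.ones(n_bins)           # Dirichlet prior over the signal bin

def detect_photon(gate_start, gate_len, true_bin=42, ambient=0.02):
    """Toy SPAD model: return the bin of the first photon detected
    inside the gate, or None. Ambient photons arrive uniformly, and
    pile-up means the first photon wins."""
    for t in range(gate_start, min(gate_start + gate_len, n_bins)):
        p = 0.8 if t == true_bin else ambient   # signal vs. ambient rate
        if rng.random() < p:
            return t
    return None

gate_len = 10
for cycle in range(500):
    theta = rng.dirichlet(alpha)              # sample a belief over depth
    gate_start = int(np.argmax(theta))        # gate where the sample peaks
    hit = detect_photon(gate_start, gate_len)
    if hit is not None:
        alpha[hit] += 1.0                     # posterior update

print("estimated depth bin:", int(np.argmax(alpha)))
```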
We apply reinforcement learning (RL) to robotics. One of the drawbacks of traditional RL algorithms has been their poor sample efficiency. One approach to improve it is model-based RL. We learn a model of the environment, essentially its dynamics and reward function, use it to generate imaginary trajectories and backpropagate through them to update the policy, exploiting the differentiability of the model. Intuitively, learning more accurate models should lead to better performance. Recently, there has been growing interest in developing better deep neural network based dynamics models for physical systems, through better inductive biases. We focus on robotic systems undergoing rigid body motion. We compare two versions of our model-based RL algorithm, one which uses a standard deep neural network based dynamics model and the other which uses a much more accurate, physics-informed neural network based dynamics model. We show that, in environments that are not sensitive to initial conditions, model accuracy matters only to some extent, as numerical errors accumulate slowly. In these environments, both versions achieve similar average-return, while the physics-informed version achieves better sample efficiency. We show that, in environments that are sensitive to initial conditions, model accuracy matters a lot, as numerical errors accumulate fast. In these environments, the physics-informed version achieves significantly better average-return and sample efficiency. We show that, in challenging environments, where we need a lot of samples to learn, physics-informed model-based RL can achieve better asymptotic performance than model-free RL, by generating accurate imaginary data, which allows it to perform many more policy updates. In these environments, our physics-informed model-based RL approach achieves better average-return than Soft Actor-Critic, a SOTA model-free RL algorithm.
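As a concrete illustration of the update described above (not the authors' code), the following sketch rolls a policy through a learned, differentiable dynamics model and backpropagates the imagined return into the policy parameters. The network sizes, horizon, reward head, and residual state parameterization are illustrative assumptions.

```python
import torch
import torch.nn as nn

# A minimal sketch of a model-based policy update: roll the policy
# through a learned, differentiable dynamics model, accumulate the
# imagined reward, and backpropagate it into the policy.

obs_dim, act_dim, horizon = 8, 2, 15

dynamics = nn.Sequential(nn.Linear(obs_dim + act_dim, 128),
                         nn.Tanh(), nn.Linear(128, obs_dim))
reward_fn = nn.Sequential(nn.Linear(obs_dim + act_dim, 128),
                          nn.Tanh(), nn.Linear(128, 1))
policy = nn.Sequential(nn.Linear(obs_dim, 64),
                       nn.Tanh(), nn.Linear(64, act_dim), nn.Tanh())

opt = torch.optim.Adam(policy.parameters(), lr=3e-4)

def imagined_return(s0):
    s, total = s0, 0.0
    for _ in range(horizon):
        a = policy(s)
        sa = torch.cat([s, a], dim=-1)
        total = total + reward_fn(sa).mean()
        s = s + dynamics(sa)        # residual next-state prediction
    return total

s0 = torch.randn(64, obs_dim)       # batch of start states (e.g., replay buffer)
loss = -imagined_return(s0)         # ascend the imagined return
opt.zero_grad()
loss.backward()                     # gradients flow through the model
opt.step()
```

In a physics-informed version, `dynamics` would be replaced by a network with rigid-body inductive biases; the policy-update machinery stays the same.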
Very large language models such as GPT-3 have shown impressive performance across a wide variety of tasks, including text summarization. In this paper, we show that this strong performance extends to opinion summarization. We explore several pipeline methods for applying GPT-3 to summarize a large collection of user reviews in a zero-shot fashion, notably approaches based on recursive summarization and selecting salient content to summarize through supervised clustering or extraction. On two datasets, an aspect-oriented summarization dataset of hotel reviews and a generic summarization dataset of Amazon and Yelp reviews, we show that the GPT-3 models achieve very strong performance in human evaluation. We argue that standard evaluation metrics do not reflect this, and evaluate against several new measures targeting faithfulness, factuality, and genericity to contrast these different methods.
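One of the pipeline variants mentioned above, recursive summarization, can be sketched as follows. This is a schematic, not the paper's code: `llm` is a placeholder that must be wired to an actual GPT-3 completion endpoint, and the prompt wording and chunk size are assumptions.

```python
# A minimal sketch of recursive (map-reduce style) zero-shot opinion
# summarization: summarize chunks of reviews, then recursively summarize
# the partial summaries until everything fits in one prompt.

def llm(prompt: str) -> str:
    """Placeholder for a GPT-3 completion call; wire this to a real
    completion endpoint before use."""
    raise NotImplementedError

def chunks(reviews, max_reviews=10):
    for i in range(0, len(reviews), max_reviews):
        yield reviews[i:i + max_reviews]

def summarize(reviews, max_reviews=10):
    if len(reviews) <= max_reviews:          # base case: fits in one prompt
        joined = "\n".join(reviews)
        return llm(f"Summarize the opinions in these reviews:\n{joined}")
    # Recursive case: summarize each chunk, then summarize the summaries.
    partials = [summarize(c, max_reviews) for c in chunks(reviews, max_reviews)]
    return summarize(partials, max_reviews)
```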
A central problem in computational biophysics is protein structure prediction, i.e., finding the optimal folding of a given amino acid sequence. This problem has been studied in a classical abstract model, the HP model, where the protein is modeled as a sequence of H (hydrophobic) and P (polar) amino acids on a lattice. The objective is to find conformations maximizing H-H contacts. It is known that even in this reduced setting, the problem is intractable (NP-hard). In this work, we apply deep reinforcement learning (DRL) to the two-dimensional HP model. We obtain conformations achieving the best-known energies for benchmark HP sequences with lengths from 20 to 50. Our DRL approach is based on a deep Q-network (DQN). We find that a DQN built on a long short-term memory (LSTM) architecture greatly enhances the RL learning ability and significantly improves the search process. DRL can sample the state space efficiently, without the need for manual heuristics. Experimentally, we show that it can find multiple distinct best-known solutions per trial. This study demonstrates the effectiveness of deep reinforcement learning in the HP model for protein folding.
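The RL agent's objective in the 2D HP model is concrete enough to sketch directly. The snippet below (an illustration, not the paper's environment) folds a sequence along a move string on the square lattice, rejects self-intersecting conformations, and scores the fold by counting H-H contacts between non-consecutive residues; the move encoding and the serpentine example fold are illustrative choices.

```python
# The 2D HP-model objective: a conformation is a self-avoiding walk on
# the square lattice, scored by the number of H-H contacts between
# residues that are not chain neighbors.

MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}

def fold(sequence: str, moves: str):
    """Place residues on the lattice; return None if self-intersecting."""
    pos = [(0, 0)]
    for m in moves:
        dx, dy = MOVES[m]
        x, y = pos[-1]
        nxt = (x + dx, y + dy)
        if nxt in pos:                      # violates self-avoidance
            return None
        pos.append(nxt)
    return pos

def hh_contacts(sequence: str, pos):
    """Count lattice-adjacent H-H pairs that are not chain neighbors."""
    count = 0
    for i in range(len(sequence)):
        for j in range(i + 2, len(sequence)):
            if sequence[i] == sequence[j] == "H":
                (xi, yi), (xj, yj) = pos[i], pos[j]
                if abs(xi - xj) + abs(yi - yj) == 1:
                    count += 1
    return count

seq = "HPHPPHHPHPPHPHHPPHPH"                 # a classic 20-mer benchmark
pos = fold(seq, "RRRRULLLLURRRRULLLL")       # a simple serpentine fold
if pos is not None:
    print("H-H contacts:", hh_contacts(seq, pos))
```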
Machine learning and deep learning-based decision making has become part of today's software. The goal of this work is to ensure that machine learning and deep learning-based systems are as trusted as traditional software. Traditional software is made dependable by following rigorous practices such as static analysis, testing, debugging, verification, and repair throughout the development and maintenance life-cycle. Similarly, for machine learning systems, we need to keep these models up to date so that their performance does not degrade. To this end, current systems rely on re-training the models on a schedule as new data arrives. In this work, we propose to measure the data drift that takes place as new data arrives, so that models can be re-trained adaptively, whenever re-training is actually required, irrespective of any schedule. In addition, we generate explanations at the sentence level and the dataset level to capture why a given payload text has drifted.
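As a sketch of how drift-triggered re-training might look (an assumed design, not the paper's method), the snippet below compares the distribution of a scalar feature of incoming payloads against a reference window using Jensen-Shannon divergence, and triggers re-training only when the score crosses a threshold. The featurizer, threshold, and window sizes are hypothetical.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

# Schedule-free, drift-triggered re-training: re-train only when the
# divergence between training-time and incoming feature distributions
# crosses a threshold.

def drift_score(reference, incoming, bins=30):
    """Jensen-Shannon divergence between two 1-D feature distributions."""
    lo = min(reference.min(), incoming.min())
    hi = max(reference.max(), incoming.max())
    p, _ = np.histogram(reference, bins=bins, range=(lo, hi), density=True)
    q, _ = np.histogram(incoming, bins=bins, range=(lo, hi), density=True)
    eps = 1e-12                              # avoid zero-probability bins
    return jensenshannon(p + eps, q + eps)

THRESHOLD = 0.2
reference = np.random.default_rng(0).normal(0.0, 1.0, 5000)  # training-time features
incoming = np.random.default_rng(1).normal(0.8, 1.3, 1000)   # new payloads (shifted)

score = drift_score(reference, incoming)
if score > THRESHOLD:
    print(f"drift={score:.3f}: re-train the model")          # trigger the pipeline
else:
    print(f"drift={score:.3f}: keep serving the current model")
```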
Robot perception models, such as deep neural networks (DNNs), are becoming increasingly powerful, and several models are trained with different accuracy and latency trade-offs. However, modern latency-accuracy trade-offs are largely reported as mean accuracy on single-step vision tasks, and little work shows which model to invoke for multi-step control tasks in robotics. The key challenge in multi-step decision making is to use the right model at the right time to accomplish the given task; that is, accomplishing the task with minimum control cost and minimum perception time is the desideratum. This is known as the model selection problem. In this work, we address precisely this problem of invoking the correct sequence of perception models for multi-step control. In other words, we provide an optimal solution to the model selection problem by casting it as a multi-objective optimization problem balancing control cost and perception time. The key insight obtained from our solution is that the variance of the perception models, not just their mean accuracy, matters for multi-step decision making, and we show how diverse perception models can be used as a primitive for energy-efficient robotics. Furthermore, we demonstrate our approach in a photorealistic drone landing simulation using visual navigation in AirSim. With our proposed policy, we achieve a 38.04% lower control cost and 79.1% less perception time than competing baselines.
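The scalarized trade-off at the heart of the model selection problem can be sketched in a few lines. The following is a hypothetical illustration, not the paper's solution: each candidate perception model is described by its error mean, error variance, and latency, and the selected model minimizes a weighted sum of expected control cost and perception time. Note how variance enters the expected cost, echoing the paper's key insight; all model names and numbers are made up.

```python
# Per-step model selection as a multi-objective trade-off, scalarized
# with weights on control cost and perception time.

models = {
    #  name:      (mean_err, var_err, latency_s)
    "tiny-dnn":   (0.18,     0.040,   0.005),
    "mid-dnn":    (0.10,     0.015,   0.020),
    "large-dnn":  (0.06,     0.012,   0.080),
}

def expected_cost(mean_err, var_err, latency, w_ctrl=1.0, w_time=2.0):
    # Control cost grows with squared error, so the variance matters,
    # not just the mean: E[e^2] = mean^2 + var.
    control = mean_err**2 + var_err
    return w_ctrl * control + w_time * latency

def select_model(w_time):
    scores = {name: expected_cost(m, v, t, w_time=w_time)
              for name, (m, v, t) in models.items()}
    return min(scores, key=scores.get)

# Early in an approach (errors cheap) vs. final landing (errors costly):
print(select_model(w_time=5.0))   # favors a fast model ("tiny-dnn")
print(select_model(w_time=0.1))   # favors an accurate model ("large-dnn")
```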
End-to-end spoken language understanding (SLU) predicts intent directly from audio using a single model. It promises to improve the performance of assistant systems by leveraging acoustic information that is lost in intermediate textual representations and by preventing cascading errors from automatic speech recognition (ASR). Moreover, a single unified model has efficiency advantages when deploying assistant systems. However, the limited number of public audio datasets with semantic parse labels hinders research progress in this area. In this paper, we release the Spoken Task-Oriented semantic Parsing (STOP) dataset, the largest and most complex SLU dataset publicly available. We additionally define low-resource splits to establish a benchmark for improving SLU when limited labeled data is available. Furthermore, in addition to the human-recorded audio, we release a TTS-generated version to benchmark the performance of low-resource domain adaptation for end-to-end SLU systems. Initial experiments show that end-to-end SLU models perform somewhat worse than their cascaded counterparts, which we hope will encourage future work in this direction.
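To make the contrast with cascaded systems concrete, here is a minimal sketch of an end-to-end SLU model that maps audio features directly to intent logits, with no intermediate transcript and hence no cascading ASR errors. The module shapes and intent inventory are illustrative assumptions, not the paper's baselines.

```python
import torch
import torch.nn as nn

# End-to-end SLU: audio features -> intent logits in one model,
# versus a cascade of ASR -> text -> semantic parser.

class EndToEndSLU(nn.Module):
    def __init__(self, n_mels=80, dim=256, n_intents=64):
        super().__init__()
        self.audio_encoder = nn.GRU(n_mels, dim, num_layers=2,
                                    batch_first=True)
        self.intent_head = nn.Linear(dim, n_intents)

    def forward(self, mel):                  # (batch, frames, n_mels)
        _, h = self.audio_encoder(mel)
        return self.intent_head(h[-1])       # intent logits from audio only

model = EndToEndSLU()
mel = torch.randn(4, 300, 80)                # 4 utterances of log-mel frames
logits = model(mel)                          # no intermediate transcript
```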
Visual question answering (VQA) in surgery is largely unexplored. Expert surgeons are scarce and are often overloaded with clinical and academic workloads. This overload often limits the time they can spend answering questions from patients, medical students, or junior residents about surgical procedures. At times, students and junior residents also refrain from asking too many questions during classes to reduce disruption. While computer-aided simulators and recordings of past surgical procedures are available for them to observe and improve their skills, they still rely heavily on medical experts to answer their questions. A surgical VQA system offering a reliable "second opinion" could serve as a backup and ease the burden on medical experts in answering these questions. The lack of annotated medical data and the presence of domain-specific terms has limited the exploration of VQA for surgical procedures. In this work, we design a Surgical VQA task that answers questions about a surgical procedure based on the surgical scene. Extending the MICCAI Endoscopic Vision Challenge 2018 dataset and a workflow recognition dataset, we introduce two Surgical VQA datasets with classification-based and sentence-based answers. To perform Surgical VQA, we employ vision-text transformer models. We further introduce a residual-MLP-based VisualBERT encoder model that enforces interaction between visual tokens and text tokens, improving performance on classification-based answering. In addition, we study the influence of the number of input image patches and of temporal visual features on model performance for both classification-based and sentence-based answering.
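A rough sketch of the classification-based setup is given below; it is an illustration under stated assumptions, not the paper's architecture. Visual patch tokens and question tokens are concatenated and fused by a transformer encoder, and a residual MLP refines the pooled embedding before the answer classifier. Dimensions, layer counts, and the exact placement of the residual MLP are guesses.

```python
import torch
import torch.nn as nn

# Classification-based surgical VQA: fuse visual and text tokens with
# self-attention, refine the pooled embedding with a residual MLP, and
# classify over a fixed answer set.

class ResidualMLP(nn.Module):
    def __init__(self, dim, hidden=1024):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(),
                                 nn.Linear(hidden, dim))
    def forward(self, x):
        return x + self.net(x)               # residual connection

class SurgicalVQAClassifier(nn.Module):
    def __init__(self, dim=256, n_answers=18, n_layers=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.res_mlp = ResidualMLP(dim)
        self.head = nn.Linear(dim, n_answers)

    def forward(self, visual_tokens, text_tokens):
        # Concatenate patch and word embeddings so self-attention can
        # mix visual and textual information.
        fused = self.encoder(torch.cat([visual_tokens, text_tokens], dim=1))
        pooled = self.res_mlp(fused.mean(dim=1))
        return self.head(pooled)

model = SurgicalVQAClassifier()
vis = torch.randn(2, 25, 256)    # e.g., 5x5 image patches per frame
txt = torch.randn(2, 12, 256)    # embedded question tokens
logits = model(vis, txt)         # (2, n_answers)
```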
Recent advances in center-based clustering algorithms combat poor local minima via implicit annealing, using a family of generalized means. These methods are variants of Lloyd's celebrated $k$-means algorithm and are best suited to spherical clusters, such as those arising from Gaussian data. In this paper, we bridge these algorithmic advances to classical work on hard clustering under Bregman divergences, which enjoy a bijection with exponential family distributions and are thus well suited to clustering objects arising from a breadth of data-generating mechanisms. The elegant properties of Bregman divergences allow us to maintain closed-form updates in a simple and transparent algorithm and, moreover, lead to new theoretical arguments for establishing finite-sample bounds that relax the bounded-support assumption made in the existing state of the art. In addition, we present a thorough empirical analysis on simulated experiments and a case study on rainfall data, finding that the proposed method outperforms existing peer methods in a variety of non-Gaussian data settings.
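The closed-form updates mentioned above follow from a classical property of Bregman divergences: the cost-minimizing center of a cluster is always the arithmetic mean of its points, whatever the divergence. The sketch below illustrates this with hard clustering under the generalized KL divergence (suited to nonnegative data); the data and the choice of $k$ are illustrative, and no annealing is included.

```python
import numpy as np

# Hard clustering under a Bregman divergence. Key closed-form property:
# for ANY Bregman divergence, the optimal cluster center is the mean.

def kl_divergence(x, c):
    """Generalized KL: the Bregman divergence generated by x log x."""
    return np.sum(x * np.log(x / c) - x + c, axis=-1)

def bregman_kmeans(X, k, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: nearest center in Bregman divergence.
        d = np.stack([kl_divergence(X, c) for c in centers], axis=1)
        labels = d.argmin(axis=1)
        # Update step: arithmetic mean, regardless of the divergence.
        centers = np.stack([X[labels == j].mean(axis=0)
                            if np.any(labels == j) else centers[j]
                            for j in range(k)])
    return labels, centers

rng = np.random.default_rng(1)
X = np.vstack([rng.gamma(2.0, 1.0, (200, 5)),      # non-Gaussian clusters
               rng.gamma(9.0, 1.0, (200, 5))]) + 1e-9
labels, centers = bregman_kmeans(X, k=2)
```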
Performing low-rank matrix completion with sensitive user data calls for privacy-preserving approaches. In this work, we propose a novel noise-addition mechanism for preserving differential privacy in which the noise distribution is inspired by the Huber loss, a well-known loss function in robust statistics. The proposed mechanism is evaluated against existing differential privacy mechanisms while solving the matrix completion problem using alternating least squares. We also propose using an iteratively reweighted least squares algorithm to complete low-rank matrices, and we study the performance of different noise mechanisms on both synthetic and real datasets. We prove that the proposed mechanism achieves $\epsilon$-differential privacy, similar to the Laplace mechanism. Furthermore, empirical results indicate that the Huber mechanism outperforms the Laplace and Gaussian mechanisms in some cases and is comparable otherwise.
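A Huber-inspired noise distribution can be sketched as a density proportional to exp(-huber(x)/b): Gaussian-like near zero and Laplace-like in the tails. The snippet below draws such noise by rejection sampling from a Laplace proposal, which dominates the target with constant M = exp(delta^2 / (2b)); delta and b are illustrative parameters, and calibrating b to a target epsilon follows the paper's analysis, which is not reproduced here.

```python
import numpy as np

# Draw noise with density proportional to exp(-huber(x)/b) by rejection
# sampling from a Laplace(scale=b/delta) proposal.

rng = np.random.default_rng(0)

def huber(x, delta):
    return np.where(np.abs(x) <= delta,
                    0.5 * x**2,
                    delta * (np.abs(x) - 0.5 * delta))

def huber_noise(size, delta=1.0, b=1.0):
    """The proposal dominates the target with M = exp(delta^2 / (2b)),
    so the acceptance probability is target / (M * proposal) <= 1."""
    out = np.empty(0)
    while out.size < size:
        x = rng.laplace(scale=b / delta, size=size)
        log_acc = (delta * np.abs(x) - huber(x, delta)) / b \
                  - delta**2 / (2 * b)
        keep = rng.random(size) < np.exp(log_acc)
        out = np.concatenate([out, x[keep]])
    return out[:size]

# Privatize a low-rank factor update by adding Huber noise (illustrative):
U = rng.standard_normal((100, 5))
U_private = U + huber_noise(U.size).reshape(U.shape)
```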